Validation of clinical problems using a UMLS-based semantic parser

نویسندگان

  • Howard Goldberg
  • Charles Hsu
  • Vincent Law
  • Charles Safran
چکیده

The capture and symbolization of data from the clinical problem list facilitates the creation of high-fidelity patient resumes for use in aggregate analysis and decision support. We report on the development of a UMLS-based semantic parser and present a preliminary evaluation of the parser in the recognition and validation of disease-related clinical problems. We randomly sampled 20% of the 26,858 unique non-dictionary clinical problems entered into OMR (Online Medical Record) between 1989 and August, 1997, and eliminated a series of qualified problem labels, e.g., history-of, to obtain a dataset of 4122 problem labels. Within this dataset, the authors identified 2810 labels (68.2%) as referring to a broad range of disease-related processes. The parser correctly recognized and validated 1398 of the 2810 disease-related labels (49.8 +/- 1.9%) and correctly excluded 1220 of 1312 non-disease-related labels (93.0 +/- 1.4%). 812 of the 1181 match failures (68.8%) were caused by terms either absent from UMLS or modifiers not accepted by the parser; 369 match failures (31.2%) were caused by labels having patterns not recognized by the parser. By enriching the UMLS lexicon with terms commonly found in provider-entered labels, it appears that performance of the parser can be significantly enhanced over a few subsequent iterations. This initial evaluation provides a foundation from which to make principled additions to the UMLS lexicon locally for use in symbolizing clinical data; further research is necessary to determine applicability to other health care settings.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Feature Engineering in Persian Dependency Parser

Dependency parser is one of the most important fundamental tools in the natural language processing, which extracts structure of sentences and determines the relations between words based on the dependency grammar. The dependency parser is proper for free order languages, such as Persian. In this paper, data-driven dependency parser has been developed with the help of phrase-structure parser fo...

متن کامل

برچسب‌زنی نقش معنایی جملات فارسی با رویکرد یادگیری مبتنی بر حافظه

Abstract Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a semantic role labeling system for Persian, using memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...

متن کامل

Towards comprehensive syntactic and semantic annotations of the clinical narrative

OBJECTIVE To create annotated clinical narratives with layers of syntactic and semantic labels to facilitate advances in clinical natural language processing (NLP). To develop NLP algorithms and open source components. METHODS Manual annotation of a clinical narrative corpus of 127 606 tokens following the Treebank schema for syntactic information, PropBank schema for predicate-argument struc...

متن کامل

برچسب‌زنی خودکار نقش‌های معنایی در جملات فارسی به کمک درخت‌های وابستگی

Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...

متن کامل

Quality Assurance of UMLS Semantic Type Assignments Using SNOMED CT Hierarchies.

BACKGROUND The Unified Medical Language System (UMLS) is one of the largest biomedical terminological systems, with over 2.5 million concepts in its Metathesaurus repository. The UMLS's Semantic Network (SN) with its collection of 133 high-level semantic types serves as an abstraction layer on top of the Metathesaurus. In particular, the SN elaborates an aspect of the Metathesaurus's concepts v...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Proceedings. AMIA Symposium

دوره   شماره 

صفحات  -

تاریخ انتشار 1998